Name: JENNIFER AGU
Student_ID: 8882641

Introduction

Framing the Problem

This project's objective is to develop and evaluate a deep learning model that classifies images into distinct categories. The model is trained on a dataset of labeled images that must be processed before being fed into a neural network. Our goal is to investigate different methods to maximize the model's performance: training a custom CNN from scratch and transfer learning (using a pre-trained VGG16 model). By comparing the performance of these two models, we can assess the benefits of transfer learning over building a model from scratch, especially for image classification tasks.

Import the required Python libraries

In [67]:
import os, shutil, pathlib
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import classification_report, confusion_matrix, precision_recall_curve
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import image_dataset_from_directory
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.applications import VGG16
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from collections import Counter

1. Obtain the Data: Get the Dogs vs Cats dataset

In [68]:
# This should point to the small Dogs vs Cats dataset (a subset of the Kaggle competition data) created in a previous notebook
data_folder = pathlib.Path('../data/kaggle_dogs_vs_cats_small')

# Load the train dataset from the data_folder/train folder path, resize the images to 180 x 180 pixels and group them into batches of 32
train_dataset = image_dataset_from_directory(data_folder / "train", image_size=(180, 180), batch_size=32)

# Load the validation dataset from the data_folder/validation folder path, with the same image size and batch size
validation_dataset = image_dataset_from_directory(data_folder / "validation", image_size=(180, 180), batch_size=32)

# Load the test dataset from the data_folder/test folder path, with the same image size and batch size
test_dataset = image_dataset_from_directory(data_folder / "test", image_size=(180, 180), batch_size=32)
Found 2000 files belonging to 2 classes.
Found 1000 files belonging to 2 classes.
Found 2000 files belonging to 2 classes.

2. EDA: Explore the data with relevant graphs, statistics and insights

  • Explore the data and label batch shapes for the train, validation and test datasets.
In [69]:
print('Train Data:')
for data_batch, labels_batch in train_dataset:
    print("data batch shape for the train data:", data_batch.shape)
    print("labels batch shape for the train data:", labels_batch.shape)
    break

print('\nValidation Data:')
for data_batch, labels_batch in validation_dataset:
    print("data batch shape for the Validation Data :", data_batch.shape)
    print("labels batch shape for the Validation Data:", labels_batch.shape)
    break 

print('\nTest Data:')
for data_batch, labels_batch in test_dataset:
    print("data batch shape for the test Data:", data_batch.shape)
    print("labels batch shape for the test Data:", labels_batch.shape)
    break
Train Data:
data batch shape for the train data: (32, 180, 180, 3)
labels batch shape for the train data: (32,)

Validation Data:
data batch shape for the Validation Data : (32, 180, 180, 3)
labels batch shape for the Validation Data: (32,)

Test Data:
data batch shape for the test Data: (32, 180, 180, 3)
labels batch shape for the test Data: (32,)

Explain: This shows that the train, validation and test datasets all yield data batches of shape (32, 180, 180, 3): each batch holds 32 images of 180 x 180 pixels with 3 channels, the image colours Red, Green, and Blue (RGB). The labels shape of (32,) gives one label per image in the batch, 0 or 1 depending on whether it is a cat or a dog.
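As a sanity check, the number of batches per split follows directly from the file counts and the batch size; a minimal sketch of the arithmetic:

```python
import math

def batch_stats(n_images, batch_size):
    """Number of batches and size of the final (possibly partial) batch."""
    n_batches = math.ceil(n_images / batch_size)
    last_batch = n_images - (n_batches - 1) * batch_size
    return n_batches, last_batch

print(batch_stats(2000, 32))   # -> (63, 16): train/test, last batch holds 16 images
print(batch_stats(1000, 32))   # -> (32, 8): validation, last batch holds 8 images
```

The 63 batches match the 63 steps per epoch reported later in the training logs.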

  • Explore the images in the train dataset
In [70]:
# Define a function to plot different sample images from the data set
def plot_sample_images(dataset, class_names):
    plt.figure(figsize=(10, 10))

    # Consider a single batch from the dataset
    for images, labels in dataset.take(1): 
        # Loop through the first 9 images in the batch
        for i in range(9):
            ax = plt.subplot(3, 3, i + 1)
            plt.imshow(images[i].numpy().astype("uint8"))                   # Display the image after converting it to the right format
            plt.title(f"Label: {class_names[int(labels[i])]}")              # Set the title of the plot using the class name
            plt.axis("off")                                                 # Hide the axis
    plt.tight_layout()
    plt.show()                                                              # Display the plot

class_names = train_dataset.class_names                                     # Get the class names from the train dataset
plot_sample_images(train_dataset, class_names)                              # Call the function to plot the sample images from the train dataset
[Figure: 3 x 3 grid of sample images from the train dataset with their class labels]

Explain: The sample images cover both classes (dogs and cats) and vary in pose, orientation, background, lighting, and image quality. This variation helps the model generalize across real-world situations.

  • Explore the distribution of classes (cats and dogs) in the training dataset
In [71]:
# Define a function to count the number of samples in each of the classes
def get_label_distribution(dataset):
    label_counts = Counter()
    for _, labels in dataset.unbatch():
        label = int(labels.numpy())
        label_counts[label] += 1
    return label_counts

# Apply the function on the train dataset
label_dist = get_label_distribution(train_dataset)

# Map the label indices to their labels
label_names = [class_names[k] for k in label_dist.keys()]
label_values = list(label_dist.values())

# Plot the bar chart
plt.bar(label_names, label_values, color='skyblue')
plt.title("Class Distribution")
plt.xlabel("Class")
plt.ylabel("Number of Images")
plt.show()

# Display the count for each label
print(f"Label counts: {dict(zip(label_names, label_values))}")
[Figure: bar chart of the class distribution in the training dataset]
Label counts: {'dog': 1000, 'cat': 1000}

Explain: We can see from this bar chart that the training dataset is balanced between the classes (cats and dogs), which is ideal for training because it reduces the risk of the model becoming biased toward one class.
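Since the split is balanced, no re-weighting is needed here; had it been imbalanced, inverse-frequency class weights could be passed to Keras `fit` via its `class_weight` argument. A sketch of that computation (the `class_weights` helper is illustrative, not part of this notebook):

```python
def class_weights(label_counts):
    """Inverse-frequency weights: total / (n_classes * count).
    A perfectly balanced split yields 1.0 for every class."""
    total = sum(label_counts.values())
    n_classes = len(label_counts)
    return {label: total / (n_classes * count)
            for label, count in label_counts.items()}

print(class_weights({"cat": 1000, "dog": 1000}))   # -> {'cat': 1.0, 'dog': 1.0}
print(class_weights({"cat": 1500, "dog": 500}))    # minority 'dog' class gets weight 2.0
```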

3. Train two networks

3.1 Define a Neural Network of your choice

Defining the model

Convolutional Neural Networks (CNNs) are a class of deep learning models designed specifically to process image data. They are commonly used for tasks such as object detection, face recognition, and image classification. Here we use one to classify cats and dogs.

The model takes an input of 180 by 180 pixel images with 3 channels (RGB) and consists of:

  • 5 convolutional layers, with the number of filters increasing from 32 to 256, to extract features from the images.
  • 4 MaxPooling layers with a pool_size of 2, one after each convolutional layer except the last. Pooling reduces the spatial dimensions (height and width) and the computation, helping the model focus on the most important features.
  • A Flatten layer that converts the 2D feature maps into a 1D vector to prepare the data for the classification layer.
  • A Dense output layer with a single neuron and a sigmoid activation for binary classification.
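The layer output shapes reported in the model summary can be derived by hand: each 3 x 3 convolution with 'valid' padding shrinks the spatial size by 2, and each pooling layer halves it (rounding down). A quick sketch of the arithmetic:

```python
def conv_out(size, kernel=3):   # 'valid' padding, stride 1
    return size - kernel + 1

def pool_out(size, pool=2):     # non-overlapping max pooling
    return size // pool

size = 180
for i in range(5):              # five conv layers
    size = conv_out(size)
    if i < 4:                   # no pooling after the final conv layer
        size = pool_out(size)

print(size)                # -> 7
print(size * size * 256)   # -> 12544 values entering the Flatten layer
```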
In [72]:
inputs = keras.Input(shape=(180, 180, 3))
x = layers.Rescaling(1./255)(inputs)
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.Flatten()(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
cnn_model = keras.Model(inputs=inputs, outputs=outputs)
In [73]:
cnn_model.summary()
Model: "model_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_4 (InputLayer)        [(None, 180, 180, 3)]     0         
                                                                 
 rescaling_1 (Rescaling)     (None, 180, 180, 3)       0         
                                                                 
 conv2d_5 (Conv2D)           (None, 178, 178, 32)      896       
                                                                 
 max_pooling2d_4 (MaxPooling  (None, 89, 89, 32)       0         
 2D)                                                             
                                                                 
 conv2d_6 (Conv2D)           (None, 87, 87, 64)        18496     
                                                                 
 max_pooling2d_5 (MaxPooling  (None, 43, 43, 64)       0         
 2D)                                                             
                                                                 
 conv2d_7 (Conv2D)           (None, 41, 41, 128)       73856     
                                                                 
 max_pooling2d_6 (MaxPooling  (None, 20, 20, 128)      0         
 2D)                                                             
                                                                 
 conv2d_8 (Conv2D)           (None, 18, 18, 256)       295168    
                                                                 
 max_pooling2d_7 (MaxPooling  (None, 9, 9, 256)        0         
 2D)                                                             
                                                                 
 conv2d_9 (Conv2D)           (None, 7, 7, 256)         590080    
                                                                 
 flatten_2 (Flatten)         (None, 12544)             0         
                                                                 
 dense_3 (Dense)             (None, 1)                 12545     
                                                                 
=================================================================
Total params: 991,041
Trainable params: 991,041
Non-trainable params: 0
_________________________________________________________________

Summary: The model has 991,041 parameters in total, all of which are trainable.
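That total can be verified by hand: a conv layer has (kernel_height * kernel_width * in_channels + 1) * filters parameters (the +1 is the per-filter bias), and a dense layer has (inputs + 1) * units. A quick check:

```python
def conv_params(kernel, c_in, c_out):
    return (kernel * kernel * c_in + 1) * c_out   # +1 per filter for the bias

def dense_params(n_in, n_out):
    return (n_in + 1) * n_out

total = sum(conv_params(3, c_in, c_out)
            for c_in, c_out in [(3, 32), (32, 64), (64, 128), (128, 256), (256, 256)])
total += dense_params(7 * 7 * 256, 1)   # Flatten output feeding the sigmoid neuron
print(total)   # -> 991041, matching the summary
```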

We will compile the CNN model using the binary crossentropy loss, the RMSprop optimizer and accuracy as the evaluation metric. We will also use a ModelCheckpoint callback to save the best model based on the validation loss. Then we will train the model for 30 epochs on the train and validation datasets.
The best-performing version is saved to the ./models/cnn_from_scratch.keras path, and the training history is stored to visualize and analyze later.

In [ ]:
# Compile the CNN model
cnn_model.compile(loss="binary_crossentropy",
              optimizer="rmsprop",
              metrics=["accuracy"])

# Define callbacks to enhance the training of the model
callbacks = [
    keras.callbacks.ModelCheckpoint(
        filepath="./models/cnn_from_scratch.keras",
        save_best_only=True,
        monitor="val_loss")
]

# Train the CNN Model
history = cnn_model.fit(
    train_dataset,
    epochs=30,
    validation_data=validation_dataset,
    callbacks=callbacks)
Epoch 1/30
63/63 [==============================] - 69s 1s/step - loss: 0.7048 - accuracy: 0.5255 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 2/30
63/63 [==============================] - 62s 987ms/step - loss: 0.6968 - accuracy: 0.5280 - val_loss: 0.6868 - val_accuracy: 0.5070
Epoch 3/30
63/63 [==============================] - 59s 928ms/step - loss: 0.6901 - accuracy: 0.5655 - val_loss: 0.6700 - val_accuracy: 0.6040
Epoch 4/30
63/63 [==============================] - 57s 909ms/step - loss: 0.6561 - accuracy: 0.6275 - val_loss: 0.6727 - val_accuracy: 0.5860
Epoch 5/30
63/63 [==============================] - 59s 929ms/step - loss: 0.6399 - accuracy: 0.6395 - val_loss: 0.6365 - val_accuracy: 0.6340
Epoch 6/30
63/63 [==============================] - 59s 935ms/step - loss: 0.5953 - accuracy: 0.6855 - val_loss: 0.5945 - val_accuracy: 0.6680
Epoch 7/30
63/63 [==============================] - 75s 1s/step - loss: 0.5619 - accuracy: 0.7095 - val_loss: 0.6778 - val_accuracy: 0.6510
Epoch 8/30
63/63 [==============================] - 66s 1s/step - loss: 0.5324 - accuracy: 0.7350 - val_loss: 0.5754 - val_accuracy: 0.7180
Epoch 9/30
63/63 [==============================] - 63s 996ms/step - loss: 0.4820 - accuracy: 0.7565 - val_loss: 0.6545 - val_accuracy: 0.6660
Epoch 10/30
63/63 [==============================] - 60s 956ms/step - loss: 0.4356 - accuracy: 0.8035 - val_loss: 0.6981 - val_accuracy: 0.6490
Epoch 11/30
63/63 [==============================] - 61s 961ms/step - loss: 0.4028 - accuracy: 0.8220 - val_loss: 0.5764 - val_accuracy: 0.7200
Epoch 12/30
63/63 [==============================] - 60s 951ms/step - loss: 0.3468 - accuracy: 0.8570 - val_loss: 0.5783 - val_accuracy: 0.7220
Epoch 13/30
63/63 [==============================] - 63s 997ms/step - loss: 0.2905 - accuracy: 0.8695 - val_loss: 0.6532 - val_accuracy: 0.7380
Epoch 14/30
63/63 [==============================] - 62s 989ms/step - loss: 0.2288 - accuracy: 0.9050 - val_loss: 0.9340 - val_accuracy: 0.6930
Epoch 15/30
63/63 [==============================] - 62s 987ms/step - loss: 0.1818 - accuracy: 0.9300 - val_loss: 0.9695 - val_accuracy: 0.7040
Epoch 16/30
63/63 [==============================] - 60s 950ms/step - loss: 0.1400 - accuracy: 0.9465 - val_loss: 1.1002 - val_accuracy: 0.7060
Epoch 17/30
63/63 [==============================] - 64s 1s/step - loss: 0.1140 - accuracy: 0.9630 - val_loss: 1.1496 - val_accuracy: 0.7250
Epoch 18/30
63/63 [==============================] - 63s 1s/step - loss: 0.0981 - accuracy: 0.9645 - val_loss: 1.2450 - val_accuracy: 0.7340
Epoch 19/30
63/63 [==============================] - 68s 1s/step - loss: 0.0686 - accuracy: 0.9745 - val_loss: 1.6035 - val_accuracy: 0.6840
Epoch 20/30
63/63 [==============================] - 69s 1s/step - loss: 0.0748 - accuracy: 0.9710 - val_loss: 1.5866 - val_accuracy: 0.7120
Epoch 21/30
63/63 [==============================] - 77s 1s/step - loss: 0.0627 - accuracy: 0.9775 - val_loss: 1.3751 - val_accuracy: 0.7320
Epoch 22/30
63/63 [==============================] - 69s 1s/step - loss: 0.0881 - accuracy: 0.9745 - val_loss: 1.4056 - val_accuracy: 0.7280
Epoch 23/30
63/63 [==============================] - 64s 1s/step - loss: 0.0322 - accuracy: 0.9900 - val_loss: 1.8468 - val_accuracy: 0.7180
Epoch 24/30
63/63 [==============================] - 65s 1s/step - loss: 0.0786 - accuracy: 0.9765 - val_loss: 1.5832 - val_accuracy: 0.7320
Epoch 25/30
63/63 [==============================] - 65s 1s/step - loss: 0.0451 - accuracy: 0.9885 - val_loss: 1.9540 - val_accuracy: 0.7040
Epoch 26/30
63/63 [==============================] - 67s 1s/step - loss: 0.0529 - accuracy: 0.9810 - val_loss: 1.8376 - val_accuracy: 0.7160
Epoch 27/30
63/63 [==============================] - 65s 1s/step - loss: 0.0391 - accuracy: 0.9845 - val_loss: 1.9217 - val_accuracy: 0.7230
Epoch 28/30
63/63 [==============================] - 72s 1s/step - loss: 0.0561 - accuracy: 0.9815 - val_loss: 1.8028 - val_accuracy: 0.7290
Epoch 29/30
63/63 [==============================] - 91s 1s/step - loss: 0.0289 - accuracy: 0.9930 - val_loss: 1.9423 - val_accuracy: 0.7300
Epoch 30/30
63/63 [==============================] - 82s 1s/step - loss: 0.0220 - accuracy: 0.9930 - val_loss: 1.9190 - val_accuracy: 0.7350

Explain
Observing the results:
The training accuracy starts at 0.5255 (approximately 53%) and rises to 0.9930 by epoch 30.
The validation accuracy peaks at epoch 13 with a value of 0.7380.
The training loss decreases steadily throughout.
The validation loss is lowest at epoch 8, at 0.5754.
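An EarlyStopping callback (imported above but not used in this run) would have halted training shortly after that epoch-8 minimum. Its patience logic can be illustrated in plain Python on the reported validation losses; `early_stop_epoch` and `patience=5` are illustrative choices here, the real equivalent being keras.callbacks.EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True):

```python
def early_stop_epoch(val_losses, patience):
    """First 1-based epoch with `patience` consecutive epochs of no improvement."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return len(val_losses)

# Validation losses reported for the first 14 epochs of the run above
val_losses = [0.6932, 0.6868, 0.6700, 0.6727, 0.6365, 0.5945, 0.6778,
              0.5754, 0.6545, 0.6981, 0.5764, 0.5783, 0.6532, 0.9340]
print(early_stop_epoch(val_losses, patience=5))   # -> 13, five epochs after the best epoch (8)
```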

In [75]:
# Extract the metrics from the history
accuracy = history.history["accuracy"]
val_accuracy = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(accuracy) + 1)

# figure size for the subplots
plt.figure(figsize=(12, 5))

# Plot the Accuracy
plt.subplot(1, 2, 1)
plt.plot(epochs, accuracy, "bo", label="Training accuracy")
plt.plot(epochs, val_accuracy, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.xlabel("Epochs")
plt.ylabel("Accuracy")
plt.legend()

# Plot the loss
plt.subplot(1, 2, 2)
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.legend()

plt.tight_layout()
plt.show()
[Figure: training and validation accuracy (left) and loss (right) over 30 epochs]

Explain: These plots make the overfitting evident: the validation loss bottoms out around epoch 8, while the training loss keeps falling and the validation loss then climbs steadily.

3.2 Fine-Tune VGG16 (pre-trained on imagenet)

  • We will load the pretrained VGG16 model with ImageNet weights, excluding the top layers, so that its convolutional base can be used as a feature extractor for our dataset.
In [76]:
# Load the VGG16 model without the top layer and with pretrained ImageNet weights
conv_base = keras.applications.vgg16.VGG16(
    weights="imagenet",
    include_top=False
)

# Print model summary
conv_base.summary()
Model: "vgg16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_5 (InputLayer)        [(None, None, None, 3)]   0         
                                                                 
 block1_conv1 (Conv2D)       (None, None, None, 64)    1792      
                                                                 
 block1_conv2 (Conv2D)       (None, None, None, 64)    36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, None, None, 64)    0         
                                                                 
 block2_conv1 (Conv2D)       (None, None, None, 128)   73856     
                                                                 
 block2_conv2 (Conv2D)       (None, None, None, 128)   147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, None, None, 128)   0         
                                                                 
 block3_conv1 (Conv2D)       (None, None, None, 256)   295168    
                                                                 
 block3_conv2 (Conv2D)       (None, None, None, 256)   590080    
                                                                 
 block3_conv3 (Conv2D)       (None, None, None, 256)   590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, None, None, 256)   0         
                                                                 
 block4_conv1 (Conv2D)       (None, None, None, 512)   1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, None, None, 512)   0         
                                                                 
 block5_conv1 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, None, None, 512)   0         
                                                                 
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________

Summary: The VGG16 convolutional base has 14,714,688 parameters in total, all of which are currently trainable.
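This count too can be reproduced from the channel progression of VGG16's thirteen 3 x 3 conv layers (the pooling layers contribute no parameters); a quick check:

```python
def conv_params(kernel, c_in, c_out):
    return (kernel * kernel * c_in + 1) * c_out   # +1 per filter for the bias

# (in_channels, out_channels) for the conv layers of blocks 1-5 of VGG16
channels = [(3, 64), (64, 64),
            (64, 128), (128, 128),
            (128, 256), (256, 256), (256, 256),
            (256, 512), (512, 512), (512, 512),
            (512, 512), (512, 512), (512, 512)]

total = sum(conv_params(3, c_in, c_out) for c_in, c_out in channels)
print(total)   # -> 14714688, matching the summary
```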

Explain: We extract features and labels from each dataset by running it through the pretrained VGG16 base. Each batch of images first receives VGG16-specific preprocessing, then passes through the convolutional base to produce feature representations, while the corresponding labels are saved. The function is applied to the training, validation and test datasets in turn.

In [77]:
# Define a function to extract features and labels from the dataset using the VGG16 convolutional base
def get_features_and_labels(dataset):
    all_features = []
    all_labels = []

    # Loop through each batch of the images and labels in the dataset
    for images, labels in dataset:
        preprocessed_images = keras.applications.vgg16.preprocess_input(images)
        features = conv_base.predict(preprocessed_images)
        all_features.append(features)
        all_labels.append(labels)
    return np.concatenate(all_features), np.concatenate(all_labels)

# Extract features and labels for train, validation and test datasets
train_features, train_labels =  get_features_and_labels(train_dataset)
val_features, val_labels =  get_features_and_labels(validation_dataset)
test_features, test_labels =  get_features_and_labels(test_dataset)
1/1 [==============================] - 4s 4s/step
...

Defining the densely connected classifier

Summary: The weights of the pretrained VGG16 convolutional base are frozen during training, preventing them from being updated. On top of it we build a custom classification head: a Flatten layer, a Dense hidden layer with ReLU activation, a Dropout layer to mitigate overfitting, and a sigmoid output layer for binary classification.

In [78]:
# Freeze the convolutional base
conv_base.trainable = False

# Build the model on top of the conv base
inputs = keras.Input(shape=(180, 180, 3))
x = conv_base(inputs)
x = layers.Flatten()(x)
x = layers.Dense(256, activation='relu')(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)

vgg_model = keras.Model(inputs, outputs)

# Print model summary
vgg_model.summary()
Model: "model_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_6 (InputLayer)        [(None, 180, 180, 3)]     0         
                                                                 
 vgg16 (Functional)          (None, None, None, 512)   14714688  
                                                                 
 flatten_3 (Flatten)         (None, 12800)             0         
                                                                 
 dense_4 (Dense)             (None, 256)               3277056   
                                                                 
 dropout_1 (Dropout)         (None, 256)               0         
                                                                 
 dense_5 (Dense)             (None, 1)                 257       
                                                                 
=================================================================
Total params: 17,992,001
Trainable params: 3,277,313
Non-trainable params: 14,714,688
_________________________________________________________________

Explain: The model has 17,992,001 parameters in total: 3,277,313 trainable parameters from the newly added classifier layers and 14,714,688 non-trainable parameters from the frozen VGG16 base.
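As a quick sanity check, the trainable count can be reproduced by hand: VGG16 downsamples a 180x180 input by a factor of 32, leaving a 5x5x512 feature map, i.e. 12,800 values once flattened (matching the Flatten layer in the summary above):

```python
# Hand-computing the trainable parameter count of the new classifier head
flattened = 5 * 5 * 512               # VGG16 feature map for a 180x180 input, flattened
dense_params = flattened * 256 + 256  # weights + biases of the Dense(256) layer
output_params = 256 * 1 + 1           # weights + bias of the sigmoid output layer
total_trainable = dense_params + output_params
print(total_trainable)  # 3277313, matching the summary
```

The remaining 14,714,688 parameters all belong to the frozen convolutional base and do not change during training.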

In [ ]:
# Compile the VGG16 model
vgg_model.compile(loss="binary_crossentropy",
                  optimizer="rmsprop",
                  metrics=["accuracy"])

# Define callbacks to enhance the training of the model
callbacks = [
    keras.callbacks.ModelCheckpoint(
        filepath="./models/vgg16_model.keras",
        save_best_only=True,
        monitor="val_loss")
]

# Train the VGG16 Model
history = vgg_model.fit(
    train_dataset,
    epochs=50,
    validation_data=validation_dataset,
    callbacks=callbacks)
Epoch 1/50
63/63 [==============================] - 438s 7s/step - loss: 7.9452 - accuracy: 0.8870 - val_loss: 1.3846 - val_accuracy: 0.9400
Epoch 2/50
63/63 [==============================] - 357s 6s/step - loss: 1.4545 - accuracy: 0.9475 - val_loss: 1.9854 - val_accuracy: 0.9030
Epoch 3/50
63/63 [==============================] - 382s 6s/step - loss: 0.5191 - accuracy: 0.9645 - val_loss: 0.6793 - val_accuracy: 0.9470
Epoch 4/50
63/63 [==============================] - 400s 6s/step - loss: 0.2682 - accuracy: 0.9815 - val_loss: 1.0961 - val_accuracy: 0.9340
Epoch 5/50
63/63 [==============================] - 370s 6s/step - loss: 0.2148 - accuracy: 0.9855 - val_loss: 0.7397 - val_accuracy: 0.9490
Epoch 6/50
63/63 [==============================] - 343s 5s/step - loss: 0.1540 - accuracy: 0.9845 - val_loss: 0.7253 - val_accuracy: 0.9580
Epoch 7/50
63/63 [==============================] - 402s 6s/step - loss: 0.1767 - accuracy: 0.9895 - val_loss: 0.8830 - val_accuracy: 0.9540
Epoch 8/50
63/63 [==============================] - 401s 6s/step - loss: 0.1423 - accuracy: 0.9900 - val_loss: 0.9885 - val_accuracy: 0.9570
Epoch 9/50
63/63 [==============================] - 365s 6s/step - loss: 0.2535 - accuracy: 0.9880 - val_loss: 0.8698 - val_accuracy: 0.9580
Epoch 10/50
63/63 [==============================] - 374s 6s/step - loss: 0.1115 - accuracy: 0.9925 - val_loss: 0.9336 - val_accuracy: 0.9590
Epoch 11/50
63/63 [==============================] - 315s 5s/step - loss: 0.1098 - accuracy: 0.9950 - val_loss: 1.0048 - val_accuracy: 0.9540
Epoch 12/50
63/63 [==============================] - 392s 6s/step - loss: 0.1236 - accuracy: 0.9920 - val_loss: 1.0349 - val_accuracy: 0.9530
Epoch 13/50
63/63 [==============================] - 399s 6s/step - loss: 0.0195 - accuracy: 0.9980 - val_loss: 1.1741 - val_accuracy: 0.9470
Epoch 14/50
63/63 [==============================] - 377s 6s/step - loss: 0.1208 - accuracy: 0.9945 - val_loss: 1.5136 - val_accuracy: 0.9370
Epoch 15/50
63/63 [==============================] - 375s 6s/step - loss: 0.0923 - accuracy: 0.9945 - val_loss: 1.3525 - val_accuracy: 0.9520
Epoch 16/50
63/63 [==============================] - 378s 6s/step - loss: 0.0270 - accuracy: 0.9975 - val_loss: 0.9302 - val_accuracy: 0.9600
Epoch 17/50
63/63 [==============================] - 371s 6s/step - loss: 0.0283 - accuracy: 0.9965 - val_loss: 1.0255 - val_accuracy: 0.9530
Epoch 18/50
63/63 [==============================] - 383s 6s/step - loss: 0.0317 - accuracy: 0.9950 - val_loss: 1.2045 - val_accuracy: 0.9560
Epoch 19/50
63/63 [==============================] - 382s 6s/step - loss: 0.1295 - accuracy: 0.9945 - val_loss: 1.6302 - val_accuracy: 0.9470
Epoch 20/50
63/63 [==============================] - 377s 6s/step - loss: 0.0280 - accuracy: 0.9960 - val_loss: 1.3293 - val_accuracy: 0.9540
Epoch 21/50
63/63 [==============================] - 367s 6s/step - loss: 0.0766 - accuracy: 0.9940 - val_loss: 0.9235 - val_accuracy: 0.9590
Epoch 22/50
63/63 [==============================] - 347s 6s/step - loss: 0.0347 - accuracy: 0.9990 - val_loss: 0.9677 - val_accuracy: 0.9620
Epoch 23/50
63/63 [==============================] - 340s 5s/step - loss: 0.0522 - accuracy: 0.9970 - val_loss: 0.9182 - val_accuracy: 0.9610
Epoch 24/50
63/63 [==============================] - 338s 5s/step - loss: 0.0012 - accuracy: 0.9995 - val_loss: 0.9561 - val_accuracy: 0.9600
Epoch 25/50
63/63 [==============================] - 342s 5s/step - loss: 0.0090 - accuracy: 0.9990 - val_loss: 1.1006 - val_accuracy: 0.9550
Epoch 26/50
63/63 [==============================] - 329s 5s/step - loss: 0.0346 - accuracy: 0.9970 - val_loss: 0.8778 - val_accuracy: 0.9570
Epoch 27/50
63/63 [==============================] - 329s 5s/step - loss: 0.0278 - accuracy: 0.9975 - val_loss: 0.9842 - val_accuracy: 0.9580
Epoch 28/50
63/63 [==============================] - 328s 5s/step - loss: 0.0051 - accuracy: 0.9990 - val_loss: 1.3091 - val_accuracy: 0.9520
Epoch 29/50
63/63 [==============================] - 325s 5s/step - loss: 0.0173 - accuracy: 0.9980 - val_loss: 1.0597 - val_accuracy: 0.9550
Epoch 30/50
63/63 [==============================] - 321s 5s/step - loss: 0.0132 - accuracy: 0.9985 - val_loss: 1.1199 - val_accuracy: 0.9560
Epoch 31/50
63/63 [==============================] - 319s 5s/step - loss: 0.0285 - accuracy: 0.9970 - val_loss: 1.2835 - val_accuracy: 0.9540
Epoch 32/50
63/63 [==============================] - 318s 5s/step - loss: 0.0419 - accuracy: 0.9975 - val_loss: 1.0156 - val_accuracy: 0.9480
Epoch 33/50
63/63 [==============================] - 330s 5s/step - loss: 0.0086 - accuracy: 0.9990 - val_loss: 0.9055 - val_accuracy: 0.9620
Epoch 34/50
63/63 [==============================] - 327s 5s/step - loss: 0.0146 - accuracy: 0.9990 - val_loss: 0.9424 - val_accuracy: 0.9570
Epoch 35/50
63/63 [==============================] - 326s 5s/step - loss: 0.0113 - accuracy: 0.9995 - val_loss: 0.8861 - val_accuracy: 0.9640
Epoch 36/50
63/63 [==============================] - 339s 5s/step - loss: 4.8828e-04 - accuracy: 0.9995 - val_loss: 0.9661 - val_accuracy: 0.9630
Epoch 37/50
63/63 [==============================] - 328s 5s/step - loss: 0.0222 - accuracy: 0.9985 - val_loss: 1.1668 - val_accuracy: 0.9520
Epoch 38/50
63/63 [==============================] - 329s 5s/step - loss: 0.0059 - accuracy: 0.9995 - val_loss: 1.0634 - val_accuracy: 0.9580
Epoch 39/50
63/63 [==============================] - 339s 5s/step - loss: 2.1523e-08 - accuracy: 1.0000 - val_loss: 1.0646 - val_accuracy: 0.9580
Epoch 40/50
63/63 [==============================] - 327s 5s/step - loss: 0.0215 - accuracy: 0.9980 - val_loss: 1.4480 - val_accuracy: 0.9530
Epoch 41/50
63/63 [==============================] - 338s 5s/step - loss: 0.0402 - accuracy: 0.9980 - val_loss: 1.4700 - val_accuracy: 0.9500
Epoch 42/50
63/63 [==============================] - 342s 5s/step - loss: 0.0242 - accuracy: 0.9980 - val_loss: 0.9304 - val_accuracy: 0.9560
Epoch 43/50
63/63 [==============================] - 327s 5s/step - loss: 0.0224 - accuracy: 0.9985 - val_loss: 1.3972 - val_accuracy: 0.9480
Epoch 44/50
63/63 [==============================] - 333s 5s/step - loss: 0.0068 - accuracy: 0.9985 - val_loss: 0.9288 - val_accuracy: 0.9560
Epoch 45/50
63/63 [==============================] - 331s 5s/step - loss: 0.0068 - accuracy: 0.9995 - val_loss: 0.9100 - val_accuracy: 0.9600
Epoch 46/50
63/63 [==============================] - 337s 5s/step - loss: 0.0116 - accuracy: 0.9985 - val_loss: 0.8512 - val_accuracy: 0.9620
Epoch 47/50
63/63 [==============================] - 340s 5s/step - loss: 0.0046 - accuracy: 0.9995 - val_loss: 1.0673 - val_accuracy: 0.9610
Epoch 48/50
63/63 [==============================] - 327s 5s/step - loss: 0.0736 - accuracy: 0.9975 - val_loss: 1.0948 - val_accuracy: 0.9550
Epoch 49/50
63/63 [==============================] - 330s 5s/step - loss: 0.0012 - accuracy: 0.9995 - val_loss: 1.0798 - val_accuracy: 0.9550
Epoch 50/50
63/63 [==============================] - 329s 5s/step - loss: 0.0022 - accuracy: 0.9995 - val_loss: 1.0196 - val_accuracy: 0.9550

Explain:
Observing the results:
The training accuracy starts at 0.8870 (approximately 89%) and climbs to 0.9995 (approximately 100%) by epoch 50, which suggests the model kept getting better at classifying the training dataset; at epoch 39 the training accuracy even reached 100%.
The validation accuracy peaked at epoch 35 with a value of 0.9640, which suggests the model also performs very well on the validation dataset.
The training loss fluctuated, rising and falling from epoch to epoch, but trended downward overall.
The validation loss was lowest at epoch 3 (0.6793) and generally rose afterwards while the training loss kept falling, which indicates the model starts overfitting beyond this point.
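Since validation loss bottoms out this early, an EarlyStopping callback (already imported above) could have halted training automatically. The patience logic it applies can be sketched in plain Python against the first ten validation losses of this run:

```python
# Minimal early-stopping sketch, mirroring what
# keras.callbacks.EarlyStopping(monitor="val_loss", patience=5) would do
def best_epoch(val_losses, patience=5):
    best, best_i, wait = float("inf"), 0, 0
    for i, v in enumerate(val_losses, start=1):
        if v < best:
            best, best_i, wait = v, i, 0  # new best: reset the patience counter
        else:
            wait += 1
            if wait >= patience:          # no improvement for `patience` epochs
                break
    return best_i, best

# First ten validation losses from the run above
val = [1.3846, 1.9854, 0.6793, 1.0961, 0.7397,
       0.7253, 0.8830, 0.9885, 0.8698, 0.9336]
epoch, loss = best_epoch(val)
print(epoch, loss)  # 3 0.6793
```

Combined with `restore_best_weights=True`, this would have stopped the run around epoch 8 and kept the epoch-3 weights, saving most of the 50-epoch training time.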

In [80]:
# Extract the metrics from the history
accuracy = history.history["accuracy"]
val_accuracy = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(accuracy) + 1)

# figure size for the subplots
plt.figure(figsize=(12, 5))

# Plot the Accuracy
plt.subplot(1, 2, 1)
plt.plot(epochs, accuracy, "bo", label="Training accuracy")
plt.plot(epochs, val_accuracy, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.xlabel("Epochs")
plt.ylabel("Accuracy")
plt.legend()

# Plot the loss
plt.subplot(1, 2, 2)
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.legend()

plt.tight_layout()
plt.show()
[Figure: training and validation accuracy (left) and loss (right) over 50 epochs]

Explain: The plots confirm that the validation loss is lowest around epoch 3, after which it trends upward while training accuracy keeps improving.

4. Explore the relative performance of the models

In [81]:
# Convert test dataset to NumPy arrays
test_images_list = []
test_labels_list = []

for images, labels in test_dataset:  
    test_images_list.append(images.numpy())
    test_labels_list.append(labels.numpy())

test_images = np.concatenate(test_images_list)
test_labels = np.concatenate(test_labels_list)

4.1 Accuracy

Accuracy for the CNN Model

In [82]:
test_model = keras.models.load_model("./models/convnet_from_scratch.keras")
test_loss, test_acc = test_model.evaluate(test_dataset)

# Print the Accuracy for the CNN Model
print(f"Test accuracy: {test_acc:.3f}")
63/63 [==============================] - 21s 303ms/step - loss: 0.5636 - accuracy: 0.7220
Test accuracy: 0.722

Accuracy for the VGG16 Model

In [83]:
test_model = keras.models.load_model("./models/vgg16_model.keras")
test_loss, test_acc = test_model.evaluate(test_dataset)

# Print the Accuracy for the VGG16 Model
print(f"Test accuracy: {test_acc:.3f}")
63/63 [==============================] - 196s 3s/step - loss: 0.6315 - accuracy: 0.9595
Test accuracy: 0.960

Conclusion: The VGG16 model, with a test accuracy of 96.0%, performs better than the CNN model, which reached a test accuracy of 72.2%.

4.2 Confusion Matrix

Confusion matrix for the CNN Model

In [84]:
# Load the saved models 
cnn_model = load_model('./models/convnet_from_scratch.keras')

# Make predictions 
cnn_predictions = cnn_model.predict(test_images)
cnn_predictions = (cnn_predictions > 0.5).astype("int32").flatten()

# Plot the confusion matrix
cnn_cm = confusion_matrix(test_labels, cnn_predictions)
plt.figure(figsize=(6, 4))
sns.heatmap(cnn_cm, annot=True, fmt='d', cmap='Blues', xticklabels=class_names, yticklabels=class_names)
plt.title("CNN Confusion Matrix")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()
63/63 [==============================] - 17s 258ms/step
[Figure: CNN confusion matrix heatmap]

Confusion matrix for the VGG16 Model

In [85]:
# Load the saved model (VGG16 with feature extraction)
vgg_model = load_model('./models/vgg16_model.keras')

# Get predictions from the model 
vgg_predictions = vgg_model.predict(test_images)
vgg_predictions = (vgg_predictions > 0.5).astype("int32").flatten()

# Confusion Matrix 
vgg_cm = confusion_matrix(test_labels, vgg_predictions)
plt.figure(figsize=(6, 4))
sns.heatmap(vgg_cm, annot=True, fmt='d', cmap='Greens', xticklabels=class_names, yticklabels=class_names)
plt.title("VGG16 Confusion Matrix")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()
63/63 [==============================] - 233s 4s/step
[Figure: VGG16 confusion matrix heatmap]

Conclusion: The confusion matrix of the CNN model shows that it misclassified 318 cats as dogs and 268 dogs as cats, while the confusion matrix of the VGG16 model shows only 31 cats misclassified as dogs and 50 dogs as cats. From these results, the VGG16 model clearly did better at separating dogs and cats than the CNN model.
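These misclassification counts translate directly into overall error rates, since the test set holds 1,000 images per class:

```python
# Overall error rates from the confusion-matrix counts quoted above
cnn_errors = 318 + 268   # cats misread as dogs + dogs misread as cats
vgg_errors = 31 + 50
total = 2000             # 1,000 test images per class

print(f"CNN error rate:   {cnn_errors / total:.1%}")  # 29.3%
print(f"VGG16 error rate: {vgg_errors / total:.1%}")  # about 4%
```

Note that 1 − 0.293 ≈ 0.71, which agrees with the accuracy reported in the CNN classification report below.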

4.3 Precision, Recall and F1-score

Precision, Recall and F1-score for the CNN Model

In [86]:
# CNN Evaluation
print("CNN Classification Report:")
print(classification_report(test_labels, cnn_predictions, target_names=class_names))
CNN Classification Report:
              precision    recall  f1-score   support

         cat       0.72      0.68      0.70      1000
         dog       0.70      0.73      0.71      1000

    accuracy                           0.71      2000
   macro avg       0.71      0.71      0.71      2000
weighted avg       0.71      0.71      0.71      2000

Precision, Recall and F1-score for the VGG16 Model

In [87]:
print("VGG16 Classification Report:")
print(classification_report(test_labels, vgg_predictions, target_names=class_names))
VGG16 Classification Report:
              precision    recall  f1-score   support

         cat       0.95      0.97      0.96      1000
         dog       0.97      0.95      0.96      1000

    accuracy                           0.96      2000
   macro avg       0.96      0.96      0.96      2000
weighted avg       0.96      0.96      0.96      2000

Conclusion
For Cats:

  • Precision: 72% for the CNN model versus 95% for the VGG16 model, which suggests the VGG16 model is far less likely to label an image as a cat incorrectly.
  • Recall: 68% for the CNN model versus 97% for the VGG16 model, which suggests the VGG16 model identifies far more of the actual cats.
  • F1-score: 70% for the CNN model versus 96% for the VGG16 model, which shows the VGG16 model achieves a much better balance between precision and recall, leading to better performance overall.

For Dogs:

  • Precision: 70% for the CNN model versus 97% for the VGG16 model, which suggests the VGG16 model is better at identifying dogs.
  • Recall: 73% for the CNN model versus 95% for the VGG16 model, which suggests the VGG16 model captures far more of the actual dogs.
  • F1-score: 71% for the CNN model versus 96% for the VGG16 model, again showing a much better balance between precision and recall.

In conclusion, the classification reports show that the VGG16 model performs better than the CNN model at classifying cats and dogs.
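The F1 values in both reports are simply the harmonic mean of precision and recall, which can be verified directly from the reported figures:

```python
def f1(precision, recall):
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

# Cat class, CNN vs VGG16, using the precision/recall values reported above
print(round(f1(0.72, 0.68), 2))  # 0.7  -> matches the CNN report
print(round(f1(0.95, 0.97), 2))  # 0.96 -> matches the VGG16 report
```

Because the harmonic mean is dominated by the smaller of the two inputs, a model cannot achieve a high F1 by trading precision heavily against recall.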

In [88]:
# Precision-Recall Curve for CNN Model
cnn_probabilities = cnn_model.predict(test_images).flatten()
precision_cnn, recall_cnn, _ = precision_recall_curve(test_labels, cnn_probabilities)
plt.plot(recall_cnn, precision_cnn, label="CNN")

# Precision-Recall Curve for VGG16 Model
vgg_probabilities = vgg_model.predict(test_images).flatten()
precision_vgg, recall_vgg, _ = precision_recall_curve(test_labels, vgg_probabilities)
plt.plot(recall_vgg, precision_vgg, label="VGG16")

# Plot 
plt.title("Precision-Recall Curve")
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.legend()
plt.grid(True)
plt.show()
63/63 [==============================] - 19s 297ms/step
63/63 [==============================] - 273s 4s/step
[Figure: precision-recall curves for the CNN and VGG16 models]

Conclusion:

  • The VGG16 model maintains high precision across almost all recall levels, which demonstrates it can detect positives with very few false alarms; its curve hugs the top-right corner of the plot.
  • The CNN model's precision drops quickly as recall increases, which shows it struggles to avoid false positives when pushed to recover more of the positive cases.
  • In short, the CNN model cannot sustain high precision as recall grows, while the VGG16 model strikes a far better balance between precision and recall.
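Each point on these curves comes from sweeping the decision threshold over the predicted probabilities. The calculation that `precision_recall_curve` performs at a single threshold can be sketched in plain Python on a toy example (the labels and scores below are made up for illustration):

```python
def pr_point(labels, scores, threshold):
    # Precision and recall for one decision threshold
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy example: six predictions, four of them actually positive
labels = [1, 0, 1, 1, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
print(pr_point(labels, scores, 0.5))  # precision 2/3, recall 1/2
```

Lowering the threshold raises recall at the cost of precision; the plotted curve traces that trade-off across all thresholds.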

4.4 Explore specific examples in which the model failed to predict correctly

In [89]:
# Show wrong predictions from CNN 
cnn_wrong_indices = np.where(cnn_predictions != test_labels)[0]

# Set up a 3x3 grid to display images
plt.figure(figsize=(10, 10))

# Display some examples where the CNN model made wrong predictions
for i in range(min(9, len(cnn_wrong_indices))):
    idx = cnn_wrong_indices[i]
    plt.subplot(3, 3, i+1)
    plt.imshow(test_images[idx].astype("uint8"))
    plt.title(f"CNN - Predicted: {class_names[cnn_predictions[idx]]}, Actual: {class_names[test_labels[idx]]}")
    plt.axis('off')

plt.tight_layout()
plt.show()
[Figure: 3x3 grid of test images the CNN model misclassified]

Conclusion: Looking at these images, the CNN model should have classified them correctly, since the cats and dogs shown have distinct features that should make them easy to tell apart.

In [90]:
# Show wrong predictions from VGG16
vgg_wrong_indices = np.where(vgg_predictions != test_labels)[0]

# Set up a 3x3 grid to display images
plt.figure(figsize=(10, 10))

# Display some examples where the VGG16 model made wrong predictions
for i in range(min(9, len(vgg_wrong_indices))):
    idx = vgg_wrong_indices[i]
    plt.subplot(3, 3, i+1)
    plt.imshow(test_images[idx].astype("uint8"))
    plt.title(f"VGG16 - Predicted: {class_names[vgg_predictions[idx]]}, Actual: {class_names[test_labels[idx]]}")
    plt.axis('off')

plt.tight_layout()
plt.show()
[Figure: 3x3 grid of test images the VGG16 model misclassified]

Conclusion: Looking at these images, we could say the VGG16 model misidentified them due to low image quality, unusual poses, visual similarity between the cats and dogs pictured, and misleading contextual elements such as busy backgrounds or animals being held together.

Final Conclusion

In conclusion, both the custom CNN and the pre-trained VGG16 model succeeded at the image classification task, but the VGG16 model, which used transfer learning, produced clearly better results in terms of accuracy and generalization. The CNN trained from scratch could perform well in principle, but it would need more computational resources and a larger amount of training data to match the transfer-learning approach. This implies that for similar image classification tasks, using pre-trained models can greatly cut down on training time and improve model performance, particularly when the dataset is limited.